(57) Abstract

Methods for detecting protease inhibitors are disclosed. Also described are DNA constructs and host cells transformed
with these constructs for use in the subject methods. The methods utilize a host cell which exhibits a negative phenotype
dependent on the activity of a given protease. Thus, inhibition of the protease confers a selectable phenotype on the cell. The negative
phenotype can be conferred by either protease-mediated activation or inactivation of a protein conferring a selectable phenotype.
The inhibitor is detected by transforming host cells expressing the genes for the selectable phenotype and a given protease with
random peptide sequences. Inhibitors so identified can be used either directly or indirectly in the treatment of protease-dependent
disorders.

METHOD FOR THE IDENTIFICATION OF PROTEASE INHIBITORS

Technical Field

The instant invention relates generally to the
identification of protease inhibitors. More
particularly, the invention relates to methods of
identifying viral protease inhibitors which can in turn
be used to treat or prevent viral infection.

Background 

Proteases are enzymes that cleave peptide
bonds, thereby altering proteins. Besides degrading
proteins, these enzymes play a regulatory role in a
variety of physiological processes. Proteases fall into
four general classes: serine, cysteine, aspartic acid,
and metalloproteases. These classes are distinguished
primarily by mechanism (Dunn, B.M., in Proteolytic
Enzymes, R.J. Beynon and J.S. Bond, eds., IRL Press,
Oxford, 1989, pp. 57-82). Serine and cysteine proteases
have almost identical two-step mechanisms with an
acyl-enzyme intermediate. Together they comprise the majority
of the known proteases. Aspartic and metalloproteases
catalyze direct hydrolysis of the peptide bond.

Many procaryotic and eucaryotic proteins are
synthesized as larger biologically inactive precursors
which become activated only when acted upon by
endoproteases (proteinases). These enzymes typically
recognize specific domains, usually less than ten amino
acids in length including the scissile bond, in exposed
loops of generally loose secondary structure (Keil, B.,
in Methods in Protein Sequence Analysis, M. Elzinga, ed.,
Humana Press, Clifton, N.J., 1982, pp. 291-304).

Maturation proteases are responsible for both
intracellular and extracellular cleavage of protein
precursors and many of the proteolytically processed
proteins in turn play key roles in physiological
abnormalities which give rise to disease states (Andrews,
P.C., et al., Experientia (1987) 43:784-789; Reich, E.,
et al., Cold Spring Harbor Symposium: Proteases and
Biological Control, Cold Spring Harbor Laboratory, Cold
Spring Harbor, 1975). Proteins that undergo
intracellular proteolytic maturation include secreted
proteins, lysosomal enzymes, mitochondrial proteins and
membrane proteins. These proteins are highly diverse in
function, having endocrine, neurological, and immune
functions, as well as acting as growth factors and
antibiotics. Secreted proteins that undergo
extracellular proteolytic processing include the plasma
zymogens involved in blood clotting and the immune
complement system.

Maturation proteases which are indirectly
involved in human disease are generally distinguished by
their high degree of substrate specificity. However, a
host of digestive proteases of lesser specificity are
also known and are more directly involved in diseases
such as chronic inflammation and tumor metastasis. These
enzymes include elastases, collagenases, mast cell
proteases, and extracellular matrix-degrading
metalloproteases, among others.

Proteases also play key roles in many
infectious diseases. An obligatory step in the
replication of many pathogenic plant and animal viruses
involves virus-determined proteolytic processing of the
primary viral gene products (Hellen, C.U.T., et al.,
Biochemistry (1989) 28:9881-9890). Plant viruses which
encode proteases for this purpose include the
potyviruses, comoviruses, nepoviruses, sobemoviruses, and
luteoviruses. These viruses cause economically important
diseases in all major monocot and dicot families.
Similarly equipped animal viruses include the
picornaviruses, retroviruses, alphaviruses, flaviviruses,
pestiviruses, coronaviruses, and adenoviruses. Diseases
caused by these viruses include foot-and-mouth disease,
AIDS, the common cold, hepatitis and polio.

For example, Zucchini Yellow Mosaic Virus
(ZYMV), a potyvirus, expresses its genome as a single 350
kDa polyprotein which is cleaved into at least seven
mature gene products by three distinct proteolytic
activities. Two of the proteases are virus-encoded
(Dougherty, W.G., and J.C. Carrington, Annu. Rev.
Phytopathol. (1988) 26:123-143; Carrington, J.C., et al.,
EMBO J. (1990) 9:1347-1353), including the potyviral 49
kDa protease. This protease is responsible for at least
five of the seven cleavages. This enzyme is a
trypsin-like cysteine protease which is structurally and
mechanistically representative of the largest class of
viral proteases, including those of the animal
picornaviruses (Dougherty, W.G., et al., Virology (1989)
172:302-310; Bazan, J.F., and R.J. Fletterick, Proc.
Natl. Acad. Sci. USA (1988) 85:7872-7876). This enzyme
is highly specific and appears to recognize a region
comprised of about seven amino acids surrounding the
scissile bond (Dougherty, W.G., and T.D. Parks, Virology
(1989) 172:145-155). Of the five sites cleaved by this
enzyme, the two flanking the protease appear to be
cleaved intramolecularly, while the remaining three
appear to be cleaved intermolecularly (Garcia, J.A., et
al., J. Gen. Virol. (1990) 71:2773-2779). Of the latter
three, the site between the NIb protein and the coat
protein appears to be the most active.

Disclosure of the Invention

The invention herein is based on the discovery
of a unique method for detecting peptide protease
inhibitors. These inhibitors can be used directly or
indirectly in the treatment of protease-dependent
diseases. Alternatively, the inhibitors so identified
can be utilized as structural models for the rational
design of peptide-mimetics.

Accordingly, in one embodiment, the subject
invention is directed to a method for detecting a
protease inhibitor which comprises:

(a) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(b) providing a pool of nucleic acid
constructs wherein at least one of the constructs in the
pool comprises a nucleic acid sequence encoding an
inhibitor of the protease;

(c) transforming the host cells of (a) with
the nucleic acid constructs of (b); and

(d) growing the transformed host cells of (c)
under conditions that distinguish cells with the
selectable phenotype, thereby detecting the presence of
the protease inhibitor.

WO 93/01305 PCT/US92/05745

In another embodiment, the subject invention is
directed to a DNA construct comprising:

(a) a first DNA coding sequence for a protein
capable of conferring a selectable phenotype on a host
cell transformed therewith, the selectable phenotype
dependent on the activity of a protease; and

(b) control sequences that are operably linked
to the first and second coding sequences whereby the
coding sequences can be transcribed and translated in a
host cell, and at least one of the control sequences is
heterologous to at least one of the coding sequences.

In an alternate embodiment, the DNA construct
further includes a second DNA coding sequence for the
protease of interest.

In yet another embodiment, the subject
invention is directed to host cells stably transformed
with these DNA constructs.

These and other embodiments of the subject
invention will readily occur to those of ordinary skill
in the art in view of the disclosure herein.

Brief Description of the Figures

Figure 1 depicts Protease Inhibitor Selection
System I, as applied to ZYMV protease.

Figure 2 depicts representative examples of
Protease Inhibitor Selection System II.

Figure 3 shows the strategy of cDNA synthesis
from ZYMV and cloning methods.

Figure 4 shows the nucleotide sequence of the
ZYMV genome (SEQ ID NO:1).

Figure 5A shows the organization of the primary
translation products of pZProe, pZPro7 and placZo-CP.
Figure 5B depicts the results of immunoblot analysis of
SDS/PAGE separated proteins from E. coli cells harboring
these plasmids.

Figure 6 depicts the derivation of the pZPro7,
pZPro9, pZPro10, pZPro11 and pZPro12 constructs and the
organization of the primary translation products. The
open boxes denote ZYMV 49 kDa protease (Pro) cleavage
sites. StrepR = streptomycin-resistant. Amp =
ampicillin-resistant, i.e., transformed. Cfu = colony-forming
units. NT = not tested.

Detailed Description of the Invention

The practice of the present invention will
employ, unless otherwise indicated, conventional
techniques of molecular biology, microbiology, virology,
recombinant DNA technology, and immunology, which are
within the skill of the art. Such techniques are
explained fully in the literature. See, e.g., Sambrook,
Fritsch & Maniatis, Molecular Cloning: A Laboratory
Manual, Second Edition (1989); Maniatis, Fritsch &
Sambrook, Molecular Cloning: A Laboratory Manual (1982);
DNA Cloning, Vols. I and II (D.N. Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic
Acid Hybridization (B.D. Hames & S.J. Higgins eds. 1984);
Animal Cell Culture (R.K. Freshney ed. 1986); Immobilized
Cells and Enzymes (IRL Press, 1986); B. Perbal, A
Practical Guide to Molecular Cloning (1984); the series,
Methods In Enzymology (S. Colowick and N. Kaplan eds.,
Academic Press, Inc.); and Handbook of Experimental
Immunology, Vols. I-IV (D.M. Weir and C.C. Blackwell
eds., 1986, Blackwell Scientific Publications).
All patents, patent applications, and
publications mentioned herein, whether supra or infra,
are hereby incorporated by reference in their entirety.




A. Definitions

In describing the present invention, the
following terms will be employed, and are intended to be
defined as indicated below.

By "protease" is meant an enzyme that cleaves a
peptide bond. The term includes both endopeptidases
(also called proteinases), which are proteases that
hydrolyze internal peptide bonds, and exopeptidases,
which are proteases that cleave either N- or C-terminal
peptide bonds. Some proteases are highly specific,
cleaving only between two particular amino acids within a
particular protein. Other proteases are less specific,
cleaving between more than one amino acid pair and/or
cleaving between an amino acid pair in more than one
location in the same and/or different proteins.
Exemplary proteases include maturation proteases
responsible for both intracellular and extracellular
cleavage of protein precursors, such as secreted
proteins, lysosomal enzymes, mitochondrial proteins,
membrane proteins, plasma zymogens, digestive enzymes,
elastases, collagenases, mast cell proteases,
extracellular matrix-degrading metalloproteinases; plant
viral proteases such as proteases from potyviruses,
comoviruses, nepoviruses, sobemoviruses, and
luteoviruses; and animal viral proteases such as
proteases from picornaviruses, retroviruses,
alphaviruses, flaviviruses, pestiviruses, coronaviruses,
and adenoviruses.

By "protease inhibitor" is meant a molecule
capable of altering the activity of a protease such that
the protease is unable to completely hydrolyze a peptide
bond for which it is specific. Protease inhibitors can
be peptides composed solely of genetically encodable
amino acids. "Protease inhibitor" also encompasses
synthetic peptide derivatives such as peptide aldehydes
and ketones, peptide boronic acids, peptide chloromethyl
ketones, azapeptides, peptide hydroxamic acids, and
peptide thiols. "Protease inhibitor" also encompasses
synthetic nonpeptide compounds such as
diisopropylphosphofluoridate, sulfonyl fluorides,
phosphoramidon, and halomethylcoumarins. For a detailed
discussion of protease inhibitors, see Proteinase
Inhibitors, A.J. Barrett and G. Salvesen, eds. (Elsevier,
Amsterdam, 1986).

The terms "peptide" and "protein" are used in
their broadest sense, i.e., any polymer of genetically
encodable amino acids (dipeptide or greater) linked
through peptide bonds. Thus, the terms include
oligopeptides, polypeptides, protein fragments, muteins,
fusion proteins and the like.

A "host cell" is a cell which has been
transformed, or is capable of transformation, by an
exogenous nucleotide sequence. As described more fully
below, host cells for use in the present invention may be
either procaryotic or eucaryotic, depending on the specific
protease in question and the selection system desired.
In general, bacterial cells (either gram-negative or
gram-positive) are the hosts of choice when the protease
and its dependent phenotype can be expressed in active
form in these cells. Eucaryotic cells can be used,
however, in cases where either the protease or its
dependent phenotype can be adequately expressed only in
such cells, such as cases in which certain types of
transport, metabolism, or post-translational modification
are required. Eucaryotic cells can also be used to
select inhibitors of other types of biological activities
which can be expressed only in such cells, such as animal
virus replication. One skilled in the art can readily
determine an appropriate host cell for use in the present
invention.

A "replicon" is any genetic element (e.g.,
plasmid, chromosome, virus) that functions as an
autonomous unit of DNA replication; i.e., capable of
replication under its own control.

A "vector" is a replicon, such as a plasmid,
phage, or cosmid, to which another DNA segment may be
attached so as to bring about the replication of the
attached segment.

A "double-stranded DNA molecule" refers to the
polymeric form of deoxyribonucleotides (bases adenine,
guanine, thymine, or cytosine) in a double-stranded
helix, both relaxed and supercoiled. This term refers
only to the primary and secondary structure of the
molecule, and does not limit it to any particular
tertiary forms. Thus, this term includes double-stranded
DNA found, inter alia, in linear DNA molecules (e.g.,
restriction fragments), viruses, plasmids, and
chromosomes. In discussing the structure of particular
double-stranded DNA molecules, sequences may be described
herein according to the normal convention of giving only
the sequence in the 5' to 3' direction along the
nontranscribed strand of DNA (i.e., the strand having the
sequence homologous to the mRNA).

A DNA "coding sequence" or a "nucleotide
sequence encoding" a particular protein, is a DNA
sequence which is transcribed and translated into a
polypeptide in vivo when placed under the control of
appropriate regulatory sequences. The boundaries of the
coding sequence are determined by a start codon at the 5'
(amino) terminus and a translation stop codon at the 3'
(carboxy) terminus. A coding sequence can include, but
is not limited to, procaryotic sequences, cDNA from
eucaryotic mRNA, genomic DNA sequences from eucaryotic
(e.g., mammalian) DNA, and even synthetic DNA sequences.
A transcription termination sequence will usually be
located 3' to the coding sequence.

A "promoter sequence" is a DNA regulatory
region capable of binding RNA polymerase in a cell and
initiating transcription of a downstream (3' direction)
coding sequence. For purposes of defining the present
invention, the promoter sequence is bounded at the 3'
terminus by the translation start codon (ATG) of a coding
sequence and extends upstream (5' direction) to include
the minimum number of bases or elements necessary to
initiate transcription at levels detectable above
background. Within the promoter sequence will be found a
transcription initiation site (conveniently defined by
mapping with nuclease S1), as well as protein binding
domains (consensus sequences) responsible for the binding
of RNA polymerase. Eucaryotic promoters will often, but
not always, contain "TATA" boxes and "CAT" boxes.
Procaryotic promoters contain Shine-Dalgarno sequences in
addition to the -10 and -35 consensus sequences.

DNA "control sequences" refers collectively to
promoter sequences, ribosome binding sites,
polyadenylation signals, transcription termination
sequences, upstream regulatory domains, enhancers, and
the like, which collectively provide for the
transcription and translation of a coding sequence in a
host cell.

A coding sequence is "operably linked to"
another coding sequence when RNA polymerase will
transcribe the two coding sequences into mRNA, which is
then translated into a chimeric polypeptide encoded by
the two coding sequences. The coding sequences need not
be contiguous to one another so long as the transcribed
sequence is ultimately processed to produce the desired
chimeric protein.

A control sequence "directs the transcription"
of a coding sequence in a cell when RNA polymerase will
bind the promoter sequence and transcribe the coding
sequence into mRNA, which is then translated into the
polypeptide encoded by the coding sequence.

A cell has been "transformed" by an exogenous
nucleotide sequence when the sequence has been introduced
inside the cell membrane. An exogenous nucleotide
sequence may or may not be integrated (covalently linked)
to chromosomal nucleic acid making up the genome of the
cell. In procaryotes and yeasts, for example, the
exogenous nucleotide sequence may be maintained on an
episomal element, such as a plasmid. With respect to
most other eucaryotic cells, a stably transformed cell is
one in which the exogenous nucleotide sequence has become
integrated into the chromosome so that it is inherited by
daughter cells through chromosome replication. This
stability is demonstrated by the ability of the
eucaryotic cell to establish cell lines or clones
comprised of a population of daughter cells containing the
exogenous sequence.

A "clone" is a population of cells derived from
a single cell or common ancestor by mitosis. A "cell
line" is a clone of a primary cell that is capable of
stable growth in vitro for many generations.

A "heterologous" region of a DNA construct is
an identifiable segment of DNA within or attached to
another DNA molecule that is not found in association
with the other molecule in nature. Thus, when the
heterologous region encodes a bacterial gene, the gene
will usually be flanked by DNA that does not flank the
bacterial gene in the genome of the source bacteria.
Another example of the heterologous coding sequence is a
construct where the coding sequence itself is not found
in nature (e.g., synthetic sequences having codons
different from the native gene). Allelic variation or
naturally occurring mutational events do not give rise to
a heterologous region of DNA, as used herein.

The term "treatment" as used herein refers to
either (i) the prevention of infection or reinfection
(prophylaxis), or (ii) the reduction or elimination of
symptoms of the disease of interest (therapy).

B. General Methods

Described herein is a system which can be used
to select effective inhibitors of proteases from large
pools of random peptide sequences. The method utilizes a
known protease which can be obtained through standard
techniques, i.e., direct isolation, synthesis or
recombinant technology. The nucleotide sequence of the
protease can be determined and used to transform a host
cell. The host cell is also transformed with a
nucleotide sequence encoding a protein that confers a
negative phenotype on the cell, such as sensitivity to a
given antibiotic, or inability to grow on a given carbon
source, which is dependent on the activity of the cloned
protease. Genes for the protease and the protein
conferring the dependent phenotype are contained on one
or more constructs which have been introduced into the
host cell. Thus, inhibition of the protease confers a
selectable phenotype on the cell (e.g., antibiotic
resistance, or the ability to grow on a given carbon
source). Once identified, the particular inhibitor can
be isolated, sequenced and further used as described
below.

The negative phenotype may be expressed by
either of two general mechanisms. In the first, a gene
conferring a dominant negative phenotype is expressed as
an inactive precursor protein which is activated by
protease-mediated cleavage at a site which has been
engineered to resemble a natural substrate of the
protease. In the second mechanism, a gene conferring a
selectable phenotype is inactivated by protease-mediated
cleavage at a similarly engineered site.
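
The two mechanisms can be summarized as a small truth table. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and it assumes an idealized all-or-nothing protease and inhibitor rather than the graded activities seen in real cells.

```python
from itertools import product


def protease_cleaves(protease_present: bool, inhibitor_effective: bool) -> bool:
    """Idealized assumption: cleavage occurs unless an effective inhibitor is present."""
    return protease_present and not inhibitor_effective


def survives_mechanism_one(protease_present: bool, inhibitor_effective: bool) -> bool:
    """First mechanism: cleavage ACTIVATES a protein conferring the negative phenotype."""
    negative_protein_active = protease_cleaves(protease_present, inhibitor_effective)
    return not negative_protein_active


def survives_mechanism_two(protease_present: bool, inhibitor_effective: bool) -> bool:
    """Second mechanism: cleavage INACTIVATES a protein conferring the selectable phenotype."""
    selectable_protein_intact = not protease_cleaves(protease_present, inhibitor_effective)
    return selectable_protein_intact


if __name__ == "__main__":
    # Print every combination of protease and inhibitor status.
    for protease, inhibitor in product((False, True), repeat=2):
        print(protease, inhibitor,
              survives_mechanism_one(protease, inhibitor),
              survives_mechanism_two(protease, inhibitor))
```

In this simplified model both mechanisms give the same selection outcome: a cell expressing the protease survives selection only when it also carries an effective inhibitor.
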
The above-described host cells can then be used
to detect effective inhibitors of the protease from large
pools of random peptides encoded on another plasmid.
Cells transformed with variants of this plasmid which
encode effective inhibitors are identified by selection
for the appropriate phenotype. This additional plasmid
contains a gene which encodes a "carrier" protein in
which all or part of an exposed domain has been
randomized with respect to its amino acid sequence.
Typically, the randomized domain may range from four to
fifteen amino acids in length. The length of the
randomized amino acid sequence will depend on the
specific application of the inhibitor and can be readily
determined by one skilled in the art. For example,
peptides intended for use for the design of peptide
mimetics will tend to have shorter sequences than
peptides for use in peptide or gene therapy.

Part or all of this sequence is randomized with
respect to the twenty genetically-encodable amino acids.
Thus, a fully randomized set of heptapeptide sequences
would contain more than 10^9 different peptides
(20^7 = 1.28 x 10^9).
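
The library sizes implied by the four-to-fifteen residue range can be checked with a few lines of arithmetic. This is a minimal sketch; the function name is hypothetical, and it counts distinct peptides rather than distinct DNA sequences.

```python
GENETICALLY_ENCODABLE_AMINO_ACIDS = 20


def peptide_library_size(randomized_positions: int) -> int:
    """Number of distinct peptides when every position may be any of the 20 amino acids."""
    if randomized_positions < 0:
        raise ValueError("randomized_positions must be non-negative")
    return GENETICALLY_ENCODABLE_AMINO_ACIDS ** randomized_positions


if __name__ == "__main__":
    for length in (4, 5, 7, 15):
        print(f"{length:>2} positions: {peptide_library_size(length):.3e} peptides")
```

For seven randomized positions the function returns 1,280,000,000, which matches the "more than 10^9" figure for heptapeptides.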

Such random sequence "libraries" can be
constructed by replacing the sequence encoding the
exposed domain in the "carrier" protein gene with a set
of synthetic oligodeoxynucleotides of random sequence. A
natural substrate of the protease in question can be
conveniently used for the "carrier" protein.
Alternatively, one of the many well-characterized natural
protease inhibitors may be used (Proteinase Inhibitors,
A.J. Barrett and G. Salvesen, eds. (Elsevier, Amsterdam,
1986), section C), in which the amino acid sequence of
the native binding domain has been randomized.
Structural constraints placed on the randomized sequence
by the flanking domains of the "carrier" may be minimized
by flanking the randomized sequence with short "spacers"
of polyproline or polyglycine which are highly flexible
(Creighton, T.E., Proteins: Structures and Molecular
Properties, W.H. Freeman, New York, 1984, ch. 5).
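
A randomized insert flanked by polyglycine spacers can be sketched in silico as follows. This is an illustration under stated assumptions, not the construction used in the disclosure: every randomized position is drawn uniformly from the four bases, GGT is chosen arbitrarily as the glycine codon, and all function names are hypothetical.

```python
import random

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for the first, second and third base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

GLYCINE_CODON = "GGT"  # assumption: any of the four glycine codons would serve


def random_insert(n_codons: int, spacer_len: int, rng: random.Random) -> str:
    """Return spacer + fully random core + spacer as a DNA string."""
    spacer = GLYCINE_CODON * spacer_len
    core = "".join(rng.choice(BASES) for _ in range(3 * n_codons))
    return spacer + core + spacer


def translate(dna: str) -> str:
    """Translate complete codons in frame; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    for _ in range(3):
        oligo = random_insert(n_codons=7, spacer_len=3, rng=rng)
        print(oligo, translate(oligo))
```

With uniformly random bases, 3 of the 64 possible codons are stop codons, so some members of such a library encode truncated peptides.
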
Some of these "random" peptides will, by
chance, have structures which are capable of binding
tightly to the active site of the protease, thereby
preventing it from either activating or inactivating the
negative phenotype-conferring protein, depending on the
mechanism employed. This, in turn, will confer the
selectable phenotype on the host cell when transformed
with these constructs. The structures of effective
inhibitors can then be determined by sequencing the
appropriate regions of constructs recovered from such
phenotype-selected cells.
A representative example of the first mechanism
described above, i.e., wherein the activation of a
negative phenotype-conferring protein is inhibited
(hereinafter referred to as Protease Inhibitor Selection
System I) as applied to the ZYMV protease is illustrated
in Figure 1. A portion of the ZYMV polyprotein is
depicted which includes the protease, replicase (NIb),
and coat protein (see U.S. Patent Application Serial No.
07/560,130). The arrows indicate substrate sites at
which the protease cleaves the polyprotein. In this
selection system, either the replicase or coat protein is
replaced with the coding sequence for the protein
conferring the negative phenotype. An example of the
latter is E. coli ribosomal protein S12, which
confers sensitivity to streptomycin on
streptomycin-resistant hosts such as E. coli strain MC1009 (Post,
L.E., and M. Nomura, J. Biol. Chem. (1980) 255:4660-
4666). A transcription repressor protein may also be
used, such as the lactose, tryptophan, or phage lambda
repressors (Lewin, B., Genes IV, Cell Press, Cambridge,
1990, pp. 240-264). These act by repressing expression
of antibiotic resistance genes in hosts in which these
genes are transcribed from repressible promoters. In
either case the negative phenotype is only displayed when
the negative phenotype protein is being actively cleaved
out of the polyprotein by the protease.

For other proteases it may be convenient to
express the protease and the negative phenotype precursor
from separate transcription units. The only requirements
are that the negative phenotype protein be linked to an
extraneous domain by a peptide sequence which is a
natural substrate for the protease in question, and that
this precursor be inactive until cleaved by the protease.

Figure 2 illustrates the second mechanism
wherein the negative phenotype is conferred by
protease-mediated inactivation of a protein conferring a
selectable phenotype (referred to herein as Protease
Inhibitor Selection System II). Examples of such
proteins include secreted or membrane proteins which
confer resistance to the antibiotics ampicillin,
tetracycline, or kanamycin (Methods in Enzymology,
vol. 43, Academic Press, New York, 1975), or which confer
the ability to utilize carbon sources such as lactose or
maltose (Bieker, K.L., and T.J. Silhavy, Trends in
Genetics (1990) 6:329-334). These proteins are normally
expressed as precursors in which an amino-terminal signal
sequence directs transport of the protein across the cell
membrane or insertion of the protein into the cell
membrane, after which the signal sequence is
proteolytically removed. In these constructs, the
protease substrate peptide sequence is inserted between
the signal sequence and the mature protein such that
cleavage by the protease renders the protein incapable of
membrane transport or insertion and thereby inactive.
Alternatively, the protease substrate sequence may be
inserted into a surface domain of the mature protein such
that cleavage by the protease renders the protein
inactive.

A special case of this selection system occurs
with proteases which are toxic when expressed in E. coli
by virtue of their fortuitous inactivation of one or more
host proteins which are required for growth (Baum, E.Z.,
et al., Proc. Natl. Acad. Sci. USA (1990) 87:5573-5577).
In such cases, the inactivated host protein(s) confer the
selectable phenotype in the presence of inhibitors of the
protease.

The random peptide inhibitor gene library may
be delivered to the selector cells by any of several
methods, the choice of which will depend to some extent
on the size of the library. One skilled in the art can
readily determine an acceptable technique to use with a
given library. For example, chemical transformation with
purified plasmid (Sambrook, J., et al., Molecular
Cloning, Cold Spring Harbor Laboratory, 1989, pp. 1.76-
1.84) can be used for libraries of up to 10^8-10^9 members,
depending on the efficiency. Such a library can
accommodate a complete set of fully randomized
pentapeptides. High voltage electroporation with
purified plasmid (Dower, W.J., et al., Nucleic Acids Res.
(1988) 16:6127-6145) is useful for libraries of 10^10-10^11
members, nearly sufficient to accommodate a complete set
of fully randomized heptapeptides. For larger libraries,
bacteriophage-derived vectors can be used for delivery by
transduction.
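
The choice of delivery method by library size can be written as a simple lookup. This is a hedged sketch: the thresholds are the upper ends of the ranges quoted above, actual capacity depends on the transformation efficiency achieved, and the function and constant names are hypothetical.

```python
# Upper ends of the ranges quoted in the text; treated here as hard cut-offs,
# which is a simplifying assumption.
CHEMICAL_TRANSFORMATION_MAX = 10**9
ELECTROPORATION_MAX = 10**11


def choose_delivery_method(library_members: int) -> str:
    """Suggest a delivery method for a library with the given number of members."""
    if library_members <= 0:
        raise ValueError("library_members must be positive")
    if library_members <= CHEMICAL_TRANSFORMATION_MAX:
        return "chemical transformation"
    if library_members <= ELECTROPORATION_MAX:
        return "electroporation"
    return "phage transduction"


if __name__ == "__main__":
    for members in (20**5, 5 * 10**10, 10**12):
        print(f"{members:.1e} members: {choose_delivery_method(members)}")
```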

For example, a plasmid vector can be converted
to a cosmid vector (Sambrook, J., et al., Molecular
Cloning, Cold Spring Harbor Laboratory, 1989, ch. 3)
simply by insertion of a cos site and an appropriate
length of "stuffer" DNA. Concatemeric ligation of the
library to such a vector can be followed by efficient
packaging into phage lambda pseudovirions using commercially
available preparations. Efficient, large-scale
transductions of the packaged cosmids into selector cells
can then be accomplished by established methods. In a
further refinement, concatemers of the inhibitor gene
library can be used instead of "stuffer" DNA in the
cosmid to achieve the necessary size for packaging. This
reduces, by an order of magnitude, the number of
transformants that need to be screened to cover the
library.

The stringency of selection by these systems
can be adjusted in a variety of ways. A number of
transcriptional promoters and enhancers of varying
strengths are available (Sambrook, J., et al., Molecular
Cloning, Cold Spring Harbor Laboratory, 1989, ch. 17),
which can be used with the protease, negative phenotype
precursor, and inhibitor genes to raise or lower the
inhibitor strength required for selection. Inducible
promoters can be used, such that their strengths may be
titrated by adjusting the amount of inducer in the growth
medium. For example, by having the inhibitor expressed
from an inducible promoter, the potency threshold of a
pool of selected inhibitors is raised, and the size of
the pool reduced, simply by reducing the amount of
inducer present during selection. In addition, such
inhibitor inducibility can be used to counterselect
stable false positives, such as revertants that have
mutated the protease gene, simply by replica plating onto
selective medium in the absence of inducer. Only the
revertants are able to grow.
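
The replica-plating counterselection amounts to a set difference between the colonies that grow with and without inducer. The sketch below is illustrative only; the clone identifiers and function name are hypothetical.

```python
from typing import Set, Tuple


def classify_clones(
    grows_with_inducer: Set[str],
    grows_without_inducer: Set[str],
) -> Tuple[Set[str], Set[str]]:
    """Split selected clones into inhibitor-dependent clones and revertants.

    A clone that still grows on selective medium without inducer cannot owe
    its phenotype to the inducible inhibitor, so it is scored as a revertant.
    """
    revertants = grows_with_inducer & grows_without_inducer
    inhibitor_dependent = grows_with_inducer - grows_without_inducer
    return inhibitor_dependent, revertants


if __name__ == "__main__":
    dependent, revertants = classify_clones(
        grows_with_inducer={"clone1", "clone2", "clone3"},
        grows_without_inducer={"clone2"},
    )
    print("inhibitor-dependent:", sorted(dependent))
    print("revertants:", sorted(revertants))
```
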
Once detected, the protease inhibitor can be
isolated and chemically characterized, using known
techniques. These systems can be used to generate
inhibitor peptides for any protease which can be
expressed in active form in a suitable host and for which
substrate cleavage site sequences are known. In addition
to the proteases of many important plant and animal viral
pathogens, inhibitors of the proteases of other types of
microbial pathogens as well as cellular proteases which
have been implicated in such disorders as rheumatoid
arthritis, Alzheimer's disease, and tumor metastasis, can
also be identified.



C. Use and Administration

The instant invention can be used to identify
protease inhibitors which in turn are useful in the
treatment of protease-dependent diseases in both plants
and animals. The inhibitors can be used directly in
peptide therapy or can be encoded in a gene and used in
gene therapy. The identified inhibitors can also serve
as structural models for the rational design of peptide
mimetics, that is, synthetic compounds that mimic the
protease-binding action of the identified protease
inhibitors to bring reactive groups into contact with the
protease active site. (See, e.g., Demuth, H.-U., J.
Enzyme Inhibition (1990) 1:249-278). The present
invention also has a more general application in the
construction and use of in vivo systems for the selection
of bioactive peptides from peptide libraries. For
example, systems may be designed for the selection of
peptide inhibitors of hydrolytic enzymes other than
proteases. A number of such enzymes are known which can
hydrolyze natural or artificial substrates to produce one
or more compounds which are toxic to E. coli (for
example, see Hydrolytic Enzymes, A. Neuberger and K.
Brocklehurst, eds., Elsevier, Amsterdam, 1987). The
expression of such an enzyme in an appropriate host cell
allows the selection of peptide inhibitors of the enzyme
based on their ability to confer viability on the cells
in the presence of toxigenic substrates.

In general, any phenotype of cultured
procaryotic or eucaryotic cells which can be altered by
the endogenous expression of appropriate peptides, such
that cells expressing such peptides can be readily
distinguished and isolated from cells which either do not
express such peptides or which express peptides which do
not alter the phenotype, may provide the basis for
establishing an in vivo system for the selection of
bioactive peptides from peptide libraries. Among the
most tractable medically important phenotypes will be
those manifesting susceptibility to microbial
pathogenicity.

For example, the endogenous expression of a
random peptide library as an exposed domain of a suitable
stable "carrier" protein in a population of cultured
mammalian cells of sufficient size to ensure that all or
most members of the library are represented in the
population may be used to select peptides which interfere
with the ability of microbial pathogens such as viruses
or bacteria or their toxins to inhibit cell growth. When
such cell populations are challenged by such pathogens or
toxins, only those cells expressing inhibitory peptides
will grow, allowing the active peptides to be identified
by established methods. Such peptides can in turn be
used for the development of effective therapies.

For the treatment of plant pathogenesis, the
identified inhibitors can be used to create transgenic
plants. One commonly used method of gene transfer in
plants involves the insertion of the gene of interest
into the T-DNA region of a Ti or Ri plasmid derived from
A. tumefaciens or A. rhizogenes, respectively. Many
control sequences are known which, when coupled to a
heterologous coding sequence and transformed into a host
organism, show fidelity in gene expression with respect to
tissue/organ specificity of the original coding sequence.
See, e.g., Benfey, P.N., and Chua, N.H., Science (1989)
244:174-181. Suitable control sequences for use in these
plasmids include promoters for constitutive leaf-specific
expression of the desired gene in the various target
plants. Other useful control sequences include a
promoter and terminator from the nopaline synthase gene
(NOS). The NOS promoter and terminator are present in
the plasmid pARC2, available from the American Type
Culture Collection and designated ATCC 67238. If such a
system is used, the virulence (vir) gene from either the
Ti or Ri plasmid must also be present, either along with
the T-DNA portion, or via a binary system where the vir
gene is present on a separate vector. Such systems,
vectors for use therein, and methods of transforming
plant cells are described in U.S. Patent No. 4,658,082,
and Simpson, R.B., et al., Plant Mol. Biol. (1986) 6:403-
415, incorporated herein by reference in their entirety.

Once constructed, these plasmids can be placed
into A. rhizogenes or A. tumefaciens and these vectors
used to transform cells of plant species which are
ordinarily susceptible to the particular plant pathogen.
The selection of either A. tumefaciens or A. rhizogenes
will depend on the plant being transformed thereby. In
general, A. tumefaciens is the preferred organism for
transformation. Most dicotyledons, some gymnosperms, and
a few monocotyledons (e.g., certain members of the
Liliales and Arales) are susceptible to infection with
A. tumefaciens. A. rhizogenes also has a wide host range,
embracing most dicots and some gymnosperms, which
includes members of the Leguminosae, Compositae and
Chenopodiaceae. Alternative techniques which have proven
to be effective in genetically transforming plants
include particle bombardment and electroporation. See,
e.g., Rhodes, C.A., et al., Science (1988) 240:204-207;
Shigekawa, K., and Dower, W.J., BioTechniques (1988)
5:742-751; Sanford, J.C., et al., Particulate Science and
Technology (1987) 5:27-37; and McCabe, D.E.,
BioTechnology (1988) 6:923-926.

Once transformed, these cells can be used to
regenerate transgenic plants. For example, whole plants
can be infected with these vectors by wounding the plant
and then introducing the vector into the wound site. Any
part of the plant can be wounded, including leaves, stems
and roots. Alternatively, plant tissue, in the form of
an explant, such as cotyledonary tissue or leaf disks,
can be inoculated with these vectors and cultured under
conditions which promote plant regeneration. Roots or
shoots transformed by inoculation of plant tissue with A.
rhizogenes or A. tumefaciens, containing the desired
gene, can be used as a source of plant tissue to
regenerate transgenic plants, either via somatic
embryogenesis or organogenesis. Examples of such methods
for regenerating plant tissue are disclosed in Shahin,
E.A., Theor. Appl. Genet. (1985) 69:235-240; U.S. Patent
No. 4,658,082; and Simpson et al., supra.

The inhibitors identified by the present method
can also be used in gene therapy. For example, HIV-
specific protease inhibitor genes, in which a natural
mammalian protease inhibitor serves as carrier for the
HIV protease inhibitor domain, can be used in anti-AIDS
gene therapy. Lymphocytes or bone marrow cells from the
patient can be transformed with the protease inhibitor
gene in vitro and returned to the patient, where they
establish an HIV-resistant subpopulation of lymphocytes
which can gradually restore cell-mediated immune function
as the patient's untransformed lymphocytes are depleted
by the virus.

Similarly, proteases active in blood, lymph, or
cerebrospinal fluid which are essential components of
disorders such as chronic inflammations, metastatic
cancers, and certain viral infections, may be targeted by
protease inhibitor gene therapy, in which the inhibitors
are secreted by transgenic lymphocytes or other
transgenic cell implants.

For therapeutic use in animals, the inhibitors
identified by the present method can be altered by
established methods to improve their pharmacokinetic
properties. For example, the inhibitors may be
administered linked to a carrier. For example, a
fragment may be conjugated with a macromolecular carrier.
Suitable carriers are typically large, slowly metabolized
macromolecules such as: proteins; polysaccharides, such
as sepharose, agarose, cellulose, cellulose beads and the
like; polymeric amino acids such as polyglutamic acid,
polylysine, and the like; amino acid copolymers; and
inactive virus particles. Especially useful protein
substrates are serum albumins, keyhole limpet hemocyanin,
immunoglobulin molecules, thyroglobulin, ovalbumin, and
other proteins well known to those skilled in the art.

The protein substrates may be used in their native
form or their functional group content may be
modified by, for example, succinylation of lysine
residues or reaction with Cys-thiolactone. A sulfhydryl
group may also be incorporated into the carrier (or
inhibitor) by, for example, reaction of amino functions
with 2-iminothiolane or the N-hydroxysuccinimide ester of
3-(4-dithiopyridyl) propionate. Suitable carriers may
also be modified to incorporate spacer arms (such as
hexamethylene diamine or other bifunctional molecules of
similar size) for attachment of peptides. Methods of
coupling peptides to proteins or cells are known to those
of skill in the art.

It is also possible to administer the
inhibitors identified using the instant method alone, or
mixed with a pharmaceutically acceptable vehicle or
excipient. Typically, the compositions are prepared as
injectables, either as liquid solutions or suspensions;
solid forms suitable for solution in, or suspension in,
liquid vehicles prior to injection may also be prepared.
The preparation may also be emulsified or the active
ingredient encapsulated in liposome vehicles. The active
immunogenic ingredient is often mixed with vehicles
containing excipients which are pharmaceutically
acceptable and compatible with the active ingredient. Suitable
vehicles are, for example, water, saline, dextrose,
glycerol, ethanol, or the like, and combinations thereof.
In addition, if desired, the vehicle may contain minor
amounts of auxiliary substances such as wetting or
emulsifying agents, or pH buffering agents. Actual
methods of preparing such dosage forms are known, or will
be apparent, to those skilled in the art. See, e.g.,
Remington's Pharmaceutical Sciences, Mack Publishing
Company, Easton, Pennsylvania, 15th edition, 1975. The
composition or formulation to be administered will, in
any event, contain a quantity of the inhibitor adequate
to achieve the desired effect in the individual being
treated.

Additional formulations which are suitable for
other modes of administration include suppositories and,
in some cases, aerosol, intranasal, oral formulations,
and sustained release formulations. For suppositories,
the vehicle composition will include traditional binders
and carriers, such as polyalkylene glycols or
triglycerides. Such suppositories may be formed from
mixtures containing the active ingredient in the range of
about 0.5% to about 10% (w/w), preferably about 1% to
about 2%. Oral vehicles include such normally employed
excipients as, for example, pharmaceutical grades of
mannitol, lactose, starch, magnesium stearate, sodium
saccharin, cellulose, magnesium carbonate, and the like.
These oral compositions may be taken in the form of
solutions, suspensions, tablets, pills, capsules,
sustained release formulations, or powders, and contain
from about 10% to about 95% of the active ingredient,
preferably about 25% to about 70%.

Intranasal formulations will usually include
vehicles that neither cause irritation to the nasal
mucosa nor significantly disturb ciliary function.
Diluents such as water, aqueous saline or other known
substances can be employed with the subject invention.
The nasal formulations may also contain preservatives
such as, but not limited to, chlorobutanol and
benzalkonium chloride. A surfactant may be present to
enhance absorption of the subject proteins by the nasal
mucosa.

Controlled or sustained release formulations
are made by incorporating the inhibitor into carriers or






WO 93/01305 



PCT/US92/05745






vehicles such as liposomes, nonresorbable impermeable
polymers such as ethylenevinyl acetate copolymers and
Hytrel® copolymers, swellable polymers such as hydrogels,
or resorbable polymers such as collagen and certain
polyacids or polyesters such as those used to make
resorbable sutures. The inhibitors can also be delivered
using implanted mini-pumps, well known in the art.

Furthermore, the inhibitors (or complexes
thereof) may be formulated into pharmaceutical
compositions in either neutral or salt forms. Pharmaceutically
acceptable salts include the acid addition salts (formed
with the free amino groups of the active polypeptides)
and which are formed with inorganic acids such as, for
example, hydrochloric or phosphoric acids, or such
organic acids as acetic, oxalic, tartaric, mandelic, and
the like. Salts formed from free carboxyl groups may
also be derived from inorganic bases such as, for
example, sodium, potassium, ammonium, calcium, or ferric
hydroxides, and such organic bases as isopropylamine,
trimethylamine, 2-ethylamino ethanol, histidine,
procaine, and the like.

To treat an animal subject, the inhibitor of
interest is administered parenterally, usually by
intramuscular injection in an appropriate vehicle. Other
modes of administration, however, such as subcutaneous,
intravenous injection and intranasal delivery, are also
acceptable. Injectable formulations will contain an
effective amount of the active ingredient in a vehicle, the
exact amount being readily determined by one skilled in
the art. The active ingredient may typically range from
about 1% to about 95% (w/w) of the composition, or even
higher or lower if appropriate. The quantity to be
administered depends on the animal to be treated and the
particular inhibitor used. Effective dosages can be
readily established by one of ordinary skill in the art
through routine trials establishing dose response curves.
The subject is treated by administration of the
particular inhibitor, in at least one dose. Moreover,
the subject may be administered as many doses as are
required to effectively treat the individual.

Below are examples of specific embodiments for
carrying out the present invention. The examples are
offered for illustrative purposes only, and are not
intended to limit the scope of the present invention in
any way.

EXAMPLES 

Example 1

The genetic expression of active ZYMV 49 kDa protease in E. coli

This example describes the construction and
expression in E. coli of a gene which encodes a portion
of the ZYMV polyprotein. The primary translation product
of this gene is a 140 kDa protein which includes the
49 kDa protease and flanking cleavage sites, a portion of
the nuclear inclusion "b" protein (NIb, also referred to
as the replicase), including the NIb/coat protein
cleavage site, followed by the coat protein (CP).
Evidence is presented showing that the expression of this
gene in E. coli leads to an accumulation of mature CP as
a result of efficient cleavage at the NIb/CP cleavage
site by the 49 kDa protease.

cDNA Cloning and Sequencing of the ZYMV Genome

A California isolate of ZYMV was obtained from
Professor J. A. Dodds of the University of California at
Riverside. The virus was propagated by mechanical
inoculation of the cotyledons of ten-day-old Cucurbita
pepo cv. Early Straightneck seedlings. Systemically
infected leaves were harvested 3-8 weeks after
inoculation and virus was purified therefrom essentially
as described by Lisa, V., et al., Phytopathology (1981)
71:667-672. The virus was quantified by absorbance at
260 nm using an extinction coefficient of 2.8 A260/mg/ml.
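
The absorbance-based quantification described above reduces to a single division. The following is a minimal sketch only, assuming the extinction coefficient is 2.8 absorbance units at 260 nm for a 1 mg/ml suspension in a 1 cm path length; the function name, the dilution-factor parameter, and the sample reading are illustrative and do not come from the specification.

```python
def virus_conc_mg_per_ml(a260, dilution_factor=1.0, ext_coeff=2.8):
    """Estimate virus concentration (mg/ml) from absorbance at 260 nm.

    ext_coeff is the absorbance of a 1 mg/ml suspension in a 1 cm cuvette
    (assumed here to be 2.8, as read from the text).
    dilution_factor is how many fold the sample was diluted before reading.
    """
    if a260 < 0 or dilution_factor <= 0 or ext_coeff <= 0:
        raise ValueError("a260 must be >= 0; other inputs must be > 0")
    return a260 * dilution_factor / ext_coeff


# Hypothetical reading: a 20-fold dilution giving A260 = 0.42
print(round(virus_conc_mg_per_ml(0.42, dilution_factor=20), 2))
```

The same relation applies to any preparation for which an extinction coefficient is known; only `ext_coeff` changes.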

Viral genomic RNA was isolated from purified
virions by digestion with protease K in borate buffer (pH
9) containing 1% SDS and 4 mM EDTA for one hour at 37°C,
followed by phenol/chloroform extraction and ethanol
precipitation. The RNA was redispersed in water,
quantified by absorbance at 260 nm, and analyzed by
agarose gel electrophoresis in the presence of methyl
mercuric hydroxide.

DNAs complementary to ZYMV RNA were synthesized
essentially according to Gubler, U., et al., Gene (1983)
25:263-269, as described in the technical manual for the
Riboclone cDNA Synthesis Kit (Promega Corp.). Figure 3
shows an outline of this procedure. The first strand was
synthesized using AMV reverse transcriptase and an
oligodeoxythymidylate primer. After second strand
synthesis, EcoRI linkers were added and digested, and the
cDNAs were ligated into the EcoRI site of pBluescript
(Stratagene, Inc.). The ligation product was then used
to transform competent E. coli XL-1 Blue cells
(Stratagene), which were then plated in the presence of
lac inducer (IPTG) and substrate (X-gal) for color
selection of recombinants. Plasmid DNA was isolated from
colorless clones by the alkaline lysis miniprep method
(Molecular Cloning, a Laboratory Manual, 2d Ed., J.
Sambrook, E. Fritsch, and T. Maniatis, eds., Cold Spring
Harbor Press, New York, 1989) and insert sizes were
estimated after digestion with EcoRI by agarose gel
electrophoresis in the presence of ethidium bromide.

pZR1, the largest cDNA clone obtained from the
first experiment, had a 2.3 kb insert, the ends of which
were sequenced using the Sanger dideoxy chain-terminating
method as described in the product literature for the
Sequenase 2 sequencing kit (United States Biologicals)
with the M13 universal and reverse primers encoded on
either end of the multiple cloning site in pBluescript.
The orientation of the insert relative to the viral
genome and the multiple cloning site was indicated by the
appearance of the polyadenylate tract from the 3' end of
the genome in the sequence from the reverse primer. The
remainder of the clone was sequenced stepwise, 200-300
nucleotides at a time, in both directions from synthetic
oligodeoxynucleotide primers complementary to the distal
ends of each of the successive sequencing runs. Sequence
data were processed and analyzed on a DEC VAX 11/750
minicomputer.

The second round of cDNA cloning was
accomplished in the same manner as the first except that
a synthetic oligodeoxynucleotide complementary to the 5'
end of pZR1 was used as primer and the cDNAs were ligated
directly, without linkers, into the EcoRV site of
pBluescript (Figure 3). From this cloning two clones
were obtained, pZB11 and pZB60, which had inserts of 2.3
kb and 3.8 kb, respectively. For the sequencing of
pZB11, nested deletions were prepared from each end of
the insert according to Henikoff, S., Gene (1984)
28:357ff, as described in the product literature for the
Erase-a-base System kit (Promega Corp.). For each
direction, approximately twenty-four clones containing
deletions spanning the entire length of the insert were
sequenced simultaneously from the M13 universal or
reverse primers. Any gaps left by failure of the
sequences of adjacent time points to overlap were filled
in using synthetic oligodeoxynucleotide primers made from
the sequence near the 5' end of the gap.
Clone pZB60 was sequenced in both directions
from nested deletions in the same manner as for pZB11
except that a unique NcoI site within pZB11 was used as
the starting point for deletions in the 5' direction and
only those clones with deletions mapping between the 5'
end of pZB11 and the 5' end of pZB60 were sequenced.

A third round of cDNA cloning was conducted as
described above for the preparation of pZB11 and pZB60
except that a synthetic oligomer complementary to the 5'
end of pZB60 was used as a primer. From this round,
pZF18, having an insert of 3.7 kb, was obtained and
sequenced as described above for pZB11 and pZB60.

The 5' end of the viral RNA sequence was
determined by reverse transcription of purified viral RNA
using a synthetic oligonucleotide primer complementary to
nucleotides 76-99 at the 5' end of pZF18. The Sanger
dideoxynucleotide chain-terminating method was used
essentially as described in the Promega Gem Seq manual
(Promega Corp.).

The continuous open reading frame of the viral
genome was identified with the aid of a computer as
described above. The coding sequences of the functional
ZYMV gene products were identified by amino acid sequence
homology to those of other potyviruses (Allison, R., et
al., Virology (1986) 154:9-20; Domier, L.L., et al.,
Nucleic Acids Res (1986) 14:5417-5430; Robaglia, C., et
al., J Gen Virol (1989) 70:935-947; Maiss, E., et al., J
Gen Virol (1989) 70:513-524). The identity of the coat
protein gene was further confirmed by subcloning the
presumptive coding sequence into a modified version of
pBluescript from which the gene could be expressed in
vitro. In vitro translation of the gene produced a
product of the expected size which reacted specifically
with antiserum raised against purified ZYMV coat protein
when analyzed by Western blotting.

Figure 4 shows the nucleotide sequence of the
ZYMV genome as determined above along with the deduced
amino acid sequence. The nucleotide sequence is numbered
from the 5' terminus. The 5' non-coding region extends
from nucleotide 1 to nucleotide 139. Nucleotides 140-142
initiate the polyprotein coding sequence with a
methionine codon in a consensus translation initiation
context (Joshi, C.P., Nucleic Acids Res (1987) 15:6643-
6653). By homology with the potyviral polyprotein
sequences cited above, the cleavage site between the
aphid transmission helper component (HC) and the 46 kDa
protein is believed to occur between the glycine at codon
766 (nucleotides 2435-2437) and the glycine at codon 767
(nucleotides 2438-2440). The cleavage site between the
46 kDa protein and the cytoplasmic inclusion protein (CI)
is believed to occur between the glutamine at codon 1164
(nucleotides 3629-3631) and the glycine at codon 1165
(nucleotides 3632-3634). The cleavage site between CI
and VPg/protease (VPg and protease are probably not
separated in ZYMV) is believed to occur between the
glutamine at codon 1798 (nucleotides 5531-5533) and the
serine at codon 1799 (nucleotides 5534-5536). The
cleavage site between VPg/protease and RNA replicase
(Rep) is believed to occur between the glutamine at codon
2284 (nucleotides 6989-6991) and the serine at codon 2285
(nucleotides 6992-6994). The cleavage site between the
RNA replicase and the coat protein (CP) is believed to
occur between the glutamine at codon 2801 (nucleotides
8540-8542) and the serine at codon 2802 (nucleotides
8543-8545). Termination of the polyprotein coincides
with termination of the coat protein and is believed to
occur at the stop codon (nucleotides 9380-9382) following
the glutamine at codon 3080. The 3' non-coding sequence
then extends from nucleotide 9383 to nucleotide 9593
before terminating in a polyadenylate sequence of
variable length. cDNA clone pZR1 contained approximately
80 adenosines at its 3' terminus.
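
The codon and nucleotide coordinates quoted above follow from simple arithmetic once the start of the open reading frame is fixed at nucleotide 140. The sketch below is offered only as a consistency check under that assumption; the function and variable names are illustrative and are not part of the specification.

```python
ORF_START = 140  # first nucleotide of the initiating methionine codon


def codon_to_nucleotides(codon, orf_start=ORF_START):
    """Return the (first, last) nucleotide positions of a 1-based codon."""
    if codon < 1:
        raise ValueError("codon numbers are 1-based")
    first = orf_start + 3 * (codon - 1)
    return first, first + 2


# Residues flanking each proposed cleavage site, with the nucleotide
# ranges quoted in the text.
quoted = {
    766: (2435, 2437), 767: (2438, 2440),      # HC / 46 kDa protein
    1164: (3629, 3631), 1165: (3632, 3634),    # 46 kDa protein / CI
    1798: (5531, 5533), 1799: (5534, 5536),    # CI / VPg-protease
    2284: (6989, 6991), 2285: (6992, 6994),    # VPg-protease / replicase
    2801: (8540, 8542), 2802: (8543, 8545),    # replicase / CP
}

for codon, span in quoted.items():
    assert codon_to_nucleotides(codon) == span, codon

# The stop codon immediately follows codon 3080, the last CP residue.
stop_first = codon_to_nucleotides(3080)[1] + 1
print(stop_first, stop_first + 2)  # prints: 9380 9382
```

Every quoted range agrees with the formula, and the computed stop codon position matches nucleotides 9380-9382 as stated.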

pZPro5

A restriction fragment of 1666 base pairs (bp)
extending between the PvuII and SspI sites of ZYMV cDNA
clone pZB11 (described above) was isolated by agarose gel
electrophoresis and ligated into the SmaI site of plasmid
pTZ18U (Sambrook, J., et al., Molecular Cloning, Cold
Spring Harbor Laboratory, 1989; Mead, D.A., et al.,
Protein Engineering (1986) 1:67). This restriction
fragment comprises a portion of the coding sequence of
the ZYMV polyprotein which includes part of the
cytoplasmic inclusion protein (CI), the 6 kDa protein,
the 49 kDa protease, and a portion of the NIb protein.
Insertion of this fragment into the SmaI site of pTZ18U
places the reading frame encoding these proteins in phase
with that of the expressible lacZα gene of pTZ18U such
that expression of this gene from the lac promoter is
expected to produce a fusion protein comprised of a small
portion of the lacZα peptide fused to the amino terminus
of the ZYMV polyprotein fragment. This construct was
denoted pZPro5 and its structure was confirmed by
dideoxynucleotide sequencing (Sanger, F., et al., Proc.
Natl. Acad. Sci. USA (1977) 74:5463-5467).

pZPro6 and pZPro7

The 2280 bp SalI restriction fragment from ZYMV
cDNA clone pZR1 (described above), which comprises a
portion of the ZYMV genome including part of the NIb
protein, CP, the 3' non-coding sequence, and a portion of
the polyadenylate sequence, was inserted into the SalI
site of pZPro5 to create pZPro6. Dideoxynucleotide
sequencing of pZPro6 confirmed that the NIb/CP-encoding
reading frame of the inserted fragment was in phase with
the open reading frame of pZPro5 such that expression of
this construct from the lac promoter is expected to
produce a single polypeptide of approximately 140 kDa.

In a further refinement of pZPro6, the 1244 bp
MluI-NaeI fragment was removed and the 705 bp MluI-EcoRV
fragment from pZR1 was inserted in its place, removing
most of the ZYMV 3' non-coding and polyadenylate
sequences, which include several unwanted restriction
sites. This construct was denoted pZPro7. E. coli
strain DH5α was transformed with pZPro6 and pZPro7, and
transformed clones were identified and isolated by
selection for ampicillin resistance.

Expression of the lacZα-ZYMV polyprotein gene
in pZPro6 and pZPro7 was monitored by immunoblotting of
SDS/PAGE-resolved proteins from these cells using
polyclonal antisera raised in rabbits against denatured
ZYMV coat protein (Burnette, W.N., Anal. Biochem. (1981)
112:195). Results are shown in Figure 5B. Extract from
cells harboring either pZPro6 or pZPro7 contained a
single immunoreactive band which co-migrated with the
major species of mature ZYMV coat protein at
approximately 31 kDa. Since the CP-containing primary
translation product of 140 kDa was not detected in these
extracts, the exclusive appearance of mature CP implies
correct and efficient processing of the polyprotein by
the ZYMV 49 kDa protease.

To rule out the possibility that mature CP
might have been produced either by an endogenous E. coli
protease, or by fortuitous initiation of translation near
the amino terminus of mature CP, a variant of pZPro7 was
also analyzed. This variant, denoted placZα-CP, was made
by deleting the sequence encoding the protease and most
of the NIb protein from pZPro7, leaving part of the NIb
protein and CP in phase with the lacZα peptide in a 46
kDa open reading frame (see Figure 5A). Extracts from
cells harboring this construct contained a single
immunoreactive band which migrated with an apparent MW of
46 kDa (Figure 5B, lane 2). The apparent absence in
these cells of a species co-migrating with mature CP in
the absence of the ZYMV 49 kDa protease indicates that
the activity of the latter is indeed responsible for the
occurrence of mature CP in cells harboring pZPro6 and
pZPro7.



Example 2 

Construction and analysis of genes which confer a
negative phenotype on E. coli cells by virtue of the
activity of the ZYMV 49 kDa protease according to the




above for Protease Inhibitor Selection 



This example describes the construction of
expressible genes encoding polyproteins which contain the
ZYMV 49 kDa protease and the E. coli ribosomal protein
S12. The ability of these gene constructs to confer
sensitivity to the antibiotic streptomycin on several
streptomycin-resistant E. coli strains by virtue of
correct and efficient processing of the polyprotein by
the ZYMV 49 kDa protease is demonstrated.

Streptomycin lethality in E. coli has been
ascribed to its ability to interfere with protein
synthesis by binding to the S12 subunit of the 30S
component of the ribosome (Gorini, L., in Ribosomes, M.
Nomura, A. Tissieres, P. Lengyel, eds., Cold Spring
Harbor Laboratory, 1974, pp. 791-803). Streptomycin-
resistant mutants have been isolated which express
altered forms of S12 which retain the ability to
participate in the assembly of ribosomes which can
function in the presence of streptomycin. Wildtype S12
has been shown to confer a dominant streptomycin-
sensitive phenotype on merodiploids which express both
wildtype and streptomycin-resistant forms of S12. The
highly sequestered position of S12 in the ribosome
suggests that S12-containing polyproteins should be too
encumbered to participate in the assembly of functional
ribosomes. Thus, the expression of such polyproteins in
streptomycin-resistant hosts should be unable to confer
streptomycin sensitivity unless mature S12 can be
proteolytically freed from the polyprotein.

pZPro9

The CP-encoding sequences in pZPro7 were
precisely replaced with the coding sequence for S12 to
create pZPro9. This was accomplished as follows. The
sequence bounded by the BglII site in NIb and the P1'
position of the NIb/CP cleavage site in pZPro7 (Schechter,
I., and A. Berger, Biochem. Biophys. Res. Commun. (1967)
27:157) was amplified by polymerase chain reaction (PCR,
Saiki, R.K., et al., Science (1988) 239:487). The S12
coding sequence from the second amino acid to the end was
amplified by PCR from plasmid pNO1523 (Dean, D., Gene
(1981) 15:99-102). The 3' primer contained an MluI site
following the stop codon. Following cleavage of the
first PCR product by BglII and the second by MluI, both
were simultaneously ligated into pZPro7 from which the
BglII-MluI fragment had been removed. Following
transformation and plasmid DNA purification, the
structure of pZPro9 (see Figure 6) was confirmed by
dideoxynucleotide sequencing.
pZPro9 was then transformed into streptomycin-
resistant E. coli strains MC1009, HB101, and N100
(American Type Culture Collection Catalogue of Bacteria
and Phages, 1989). After two rounds of single colony
isolation in the presence of ampicillin (amp), single
colonies of each transformant were grown in Luria-Bertani
medium (LB) containing 50 µg/ml amp to mid-log phase and
plated on solid LB containing 100 µg/ml amp, 100 µg/ml
streptomycin (strep), or both.

Consistently, fewer than one in 10^5 amp-
resistant colony-forming units (cfu) of each transformant
was observed to grow in the presence of both amp and
strep, while the same hosts harboring pZPro7 plated with
similar efficiencies on amp alone or amp and strep (see
Figure 6). Thus, by virtue of having S12 in place of CP,
pZPro9 is able to confer strep sensitivity on strep-
resistant hosts, while its CP-containing parent, pZPro7,
is not. The pZPro9 transformants were fully sensitive to
as little as 3 µg strep/ml while the parent strains were
fully resistant to up to 350 µg/ml. Also, the pZPro9
transformants plated equally well on strep alone or amp
alone, indicating that pZPro9 is quickly lost in the
absence of amp selection and that there is no discernible
tendency to replace the strep-resistant gene in the host
chromosome with the S12 gene by homologous recombination.
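
The plating comparison above amounts to a ratio of colony counts on selective versus reference plates. A minimal sketch of that calculation follows; the function names and the colony counts are hypothetical, and only the threshold of one in 10^5 is taken from the text.

```python
def plating_efficiency(cfu_selective, cfu_reference):
    """Fraction of reference cfu that also grow on the selective plate."""
    if cfu_reference <= 0:
        raise ValueError("reference count must be positive")
    return cfu_selective / cfu_reference


def is_strep_sensitive(cfu_amp_strep, cfu_amp, threshold=1e-5):
    """True when fewer than `threshold` of amp-resistant cfu grow on amp plus strep."""
    return plating_efficiency(cfu_amp_strep, cfu_amp) < threshold


# Hypothetical counts (cfu per ml of culture), for illustration only.
print(is_strep_sensitive(3, 2_000_000))          # a pZPro9-like transformant
print(is_strep_sensitive(1_900_000, 2_000_000))  # a pZPro7-like transformant
```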

To confirm that cleavage of the polyprotein by
the ZYMV 49 kDa protease to liberate mature S12 is
required for strep sensitivity, the NIb/CP cleavage site
was removed from pZPro9 to create pZPro12, which should
produce a polyprotein from which S12 cannot be freed by
the protease. This was accomplished by cleaving pZPro9
with EcoRV and HpaI, which removed most of the NIb
protein including the NIb/CP cleavage site, and replacing
it with the fragment produced by EcoRV alone, which
restored most of the NIb protein down to within 12 amino
acids of the NIb/CP cleavage site (see Figure 6). The
structure of pZPro12 was confirmed by dideoxynucleotide
sequencing. Transformation with pZPro12 has no
discernible effect on the ability of strep-resistant
E. coli strains to grow vigorously in the presence of up to
350 µg/ml streptomycin. Thus, the strep-sensitive
phenotype produced by pZPro9 is completely dependent on
the presence of a substrate cleavage site at which the
protease can cleave functional S12 from the polyprotein.

The expression of the pZPro9 polyprotein, like
that of most large eucaryotic proteins, places a considerable
burden on growing E. coli cells. In an attempt to reduce
this burden, pZPro9 was streamlined by removing the
EcoRV fragment described above, which contains most of
the NIb protein exclusive of the cleavage sites at either
end. This construct, denoted pZPro10, encodes a
polyprotein of about 83 kDa, of which 49 kDa is the
protease and about 14 kDa is S12 (see Figure 6). Upon
transformation with pZPro10, strep-resistant E. coli
strains displayed a strep-sensitive phenotype identical
to that of the pZPro9 transformants. In addition,
pZPro10 transformants grew considerably more vigorously
than pZPro9 transformants, indicating a significant
reduction in the metabolic burden on the host cells.
Thus, removal of most of the NIb protein had no
discernible effect on the efficiency of removal of
functional S12 from the polyprotein by the protease.

However, pZPro10-expressing cells still grew
poorly compared to the untransformed host. Recent work
with another viral protease suggests that this is
probably due, at least in part, to fortuitous activity of
the protease on host proteins (Baum, E.Z., et al., Proc.
Natl. Acad. Sci. USA (1990) 87:5573-5577). Inhibitors of
the protease should at least partly restore normal
growth. As this growth differential is relatively easy
to score, it is possible to use the toxicity of the
protease as the negative phenotype, and to select
inhibitors from peptide libraries, or to confirm selected
inhibitors on the basis of their ability to restore rapid
growth.
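
One way the growth differential might be scored is as the fraction of the growth defect that a candidate inhibitor recovers. The sketch below is an assumption made for illustration, not a procedure given in the text; the function name and the growth rates (doublings per hour) are hypothetical.

```python
def growth_restoration(rate_host, rate_protease, rate_candidate):
    """Fraction of the growth defect recovered by a candidate inhibitor.

    0.0 means no better than cells expressing the protease alone;
    1.0 means growth equal to the untransformed host.
    """
    defect = rate_host - rate_protease
    if defect <= 0:
        raise ValueError("protease-expressing cells must grow more slowly than the host")
    return (rate_candidate - rate_protease) / defect


# Hypothetical growth rates, for illustration only.
print(round(growth_restoration(2.0, 0.8, 1.7), 2))
```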

Once the protease removes itself from the
polyprotein of pZPro10, the 14 kDa S12 is left in a 21.5
kDa precursor until freed by the protease. To confirm
that neither this precursor nor the polyprotein itself is
able to confer strep sensitivity, the NIb/CP cleavage
site was removed from pZPro10 in the same manner that the
EcoRV-HpaI restriction fragment was removed from pZPro9
to create pZPro12. This new construct, pZPro11, shown in
Figure 6, had no discernible effect on the level of strep
resistance shown by strep-resistant E. coli strains.
Thus, again, the presence of a substrate cleavage site
adjacent to S12 is required for strep sensitivity,
implying that the ZYMV 49 kDa protease is specifically
responsible for generating functional S12.
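
The results for the five constructs can be summarized as a simple rule: a strep-resistant host becomes strep-sensitive only when S12 is present, a cleavage site lies adjacent to it, and the protease is active. The sketch below encodes that rule; the class and function names are hypothetical, and the `protease_active` flag is an assumption standing in for the presence or absence of an inhibitor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    name: str
    reporter: str            # "S12" or "CP"
    has_cleavage_site: bool  # NIb/CP cleavage site adjacent to the reporter


def predicted_phenotype(construct, protease_active=True):
    """Phenotype of a strep-resistant host carrying the construct."""
    s12_released = (
        construct.reporter == "S12"
        and construct.has_cleavage_site
        and protease_active
    )
    return "strep-sensitive" if s12_released else "strep-resistant"


# Features as described in the text for each plasmid.
CONSTRUCTS = [
    Construct("pZPro7", "CP", True),
    Construct("pZPro9", "S12", True),
    Construct("pZPro10", "S12", True),
    Construct("pZPro11", "S12", False),
    Construct("pZPro12", "S12", False),
]

for c in CONSTRUCTS:
    print(c.name, predicted_phenotype(c))

# With the protease inhibited, a pZPro9 host is predicted to stay resistant,
# which is the basis for selecting inhibitors on streptomycin.
print(predicted_phenotype(CONSTRUCTS[1], protease_active=False))
```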

Thus, systems for identifying and selecting
protease inhibitors from peptide libraries have been
disclosed. Although preferred embodiments of the subject
invention have been described in some detail, it is
understood that obvious variations can be made without
departing from the spirit and the scope of the invention
as defined by the appended claims.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



AAAATTGAAA CAAATCACAA ACACTACAAC AATCAACCAT CAAGCAAACG AATTTTTGAA 60

CCTATTTACA AACAAGCAAT CTAAAACTCT TACAGTATTA AGAAATTCTC CAATCACTTC 120 

GTTTACTTCA GACATAACAA TGCCCTCCAT CATGATTGGT TCAATCTCTG TACCCATTGC 180 

AAAGACTGAG CAGTGTGCAA ACACTCAAGT AAGTAATCGG GCTAATATAG TGGCACCTGG 240 

CCACATCGCA ACATGCCCAT TCCCACTGAA AACGCACATG TATTACAGGC ATGAGTCCAA 300 

GAAGTTGATG CAATCAAACA AGAGCATTGA CATTCTGAAC AACTTCTTCA GCACTGACGA 360 

GATGAAGTTT AGGCTCACTC GAAACGAGAT GAGCAAGCTG AAAAAGGGTC CGAGCGGGAG 420 

GATAGTCCTC CGCAACCCGA GTAAGCAGCG GCTTTTCGCT CGTATCGAGC AGGATGAGGC 480 

ACCACGCAAG GAACAGGCTG TTTTCCTCGA AGGAAATTAT GACGATTCCA TCACAAATCT 540

AGCACGTGTT CTTCCACCTG AAGTGACTCA CAACGTTGAT GTGAGCTTGC GATCACCGTT 600 

TTACAAGCGC ACATACAAGA AGGAAAGGAA GAAAGTGGCG CAAAAGCAAA TTGTCCAAGC 660 

ACCACTTAAT AOCTTGTGCA CACGTGTTCT TAAAATTGCA CGCAATAAAA ATATCCCTCT 720 

TGAGATGATT GGCAACAAGA AGGCGAGACA TACACTCACC TTCAAGAGGT TTAGGGGATG 780 

TTTTGTTGGA AAGGTCTCAG TTGCGCATGA AGAAGGACGA ATGCGGCACA CTGAGATGTC 840 

GTATGAGCAG TTTAAATGGC TTCTTAAAGC CATTTCTCAG GTCACCCATA CAGAGrCGAAT 900 

7CGTGAGGAA GATATTAAAC CAGGTTGTAG TGGGTGGGTG TTGGGCACTA ATCATACATT 960 

GACTAAAAGA TATTCAAGAT TCCCACATTT GCTGATTCGA GGTAGAGACC ACGATGGGAT 1020 

TGTGAACGCG CTGGAACAGC TGTTATTTTA TAGCGAAGTT GACCACTATT CGTOGCAACC 1080 

GGAAGTTCAG TTCTTCCAAG GATGGCGACG AATGTTTGAT AAGTTTAGGC CTAGCCCAGA 1140 



TCATCTCTCC AAACTTGACC ACAACAACGA GGAATGTGGT GACTTACCAG CAATCTTTTG 1200

TCAGGCTCTA TTCCCACTAG TGAAACTATC CTCCCAAACA TGCAGAGAAA AOCTTAGTAC 1260 

AGTTAGCTTT GAGGAATTCA AAGATTCTTT GAACGCAAAC TTTATTATCC ACAAGGATGA 1320 


ATGGGGTAGT TTCAAGGAAC GCTCTCAATA CGATAATATT TTCAAATTAA TCAAAGTGCC 1380 

AACACAGGCA ACTCAGAATC TCAAGCTCTC ATCTGAAGTT ATGAAATTAG TTCAGAACCA 1440 

CACAAGCACT CACATGAAGC AAATACAAGA CATCAATAAG GCGCTCATGA AAGGTTCATT 1500 


GGTTGCGCAA GACGAATTGG ACTTAGCTTT GAAACAGCTT CTTGAAATGA CTCAGTGGTT 1560 

TAACAACCAC ATGCACCTCA CTGGTGAGGA GGCATTGAAG ATGTTCAGAA ATAAGCGTTC 1620 

TAGCAAGGCC ATGATAAATC CTAGCCTTCT ATGTGGCAAC CAATTGGACA AAAATGGAAA 1680 

TTTTGTTTGG GGAGAAAGAG GATACCATTC CAAGCOATTA TTCAAGAACT TCTTCGAAGA 1740

AGTAATACCA AGCGAAGGAT ATACGAAGTA CGTAGTGCGA AACTTTCCAA ATGGTACTCG 1800

TAAGTTGGCC ATAGGCTCAT TGATTGTACC ACTTAATTTG GATAGGGCAC GCACTGCACT 1860 


ACTTGGAGAG AGTATTGAGA AGAAGCCACT CACATGAGCG TGTGTCTCCC AACAGAATGG 1920 

AAATTATATA CACTCATGCT GCTGTGTAAC GATGGATGAT GGAACCCCGA TGTACTCCGA 1980 

GCTTAAGAGC CCGACGAAGA GGCATCTAGT TATAGGAGCT TCTAGTGATC CAAAGTACAT 2040 

TGATCTGCCA GCATCTGAGG CAGAACGCAT GTATATAGCA AAGGAAGGTT ATTGCTATCT 2100

CAGTATTTTC CTCGCAATGC TTGTAAATGT TAATGAGAAC GAAGCAAAGG ATTTCACCAA 2160 

AATGATTCGT GATGTTTTGA TCCCCATGCT TGGGCAGTGG CCTTCATTGA TGGATGTTGC 2220 


AACTGCAGCA TATATTCTAO GTGTATTCCA TCCTGAAACG CGATGCGCTG AATTACCCAG 2280 

GATCCTTGTT GACCACGCTA CACAAACCAT GCATGTCATT GATTCTTATG GATCACTAAC 2340

TGTTGGGTAT CACGTGCTCA AGGCTGGAAC TGTCAATCAT TTAATTCAAT TTGCCTCAAA 2400

TGATCTGCAA ACCCAGATGA AACATTACAG AGTTCGTGGC ACACCAACAC ACCCCATTAA 2460 

ACTCOAGCAC CAGCTCATTA AAGGAATTTT CAAACCAAAA CTTATGATGC AGCTCCTGCA 2520 

TGATGACCCA TACATATTAT TACTTGGCAT GATTTCACCC ACCATTCTTG TACATATGTA 2580 

TAGGATGCGT CATTTTGAGC GGG6TATTCA GATATGGATT AAGAGGGATC ATGAAATCGG 2640 

AAAGATTTTC GTCATATTAG AGCAGCTCAC ACGCAAGGTT GCTCTGGCAG AAGTTCTTGT 2700 

GGATCAACTT AACTTGATAA GTCAAGCTTC ACCACATTTA CTTGAAATTA TGAAGGGTTG 2760 

XCAAGATAAT GAGAGGGCAT ACGTACCTGC GCTGGATTTG CTAACGATAC AAGTGGAGCG 2820 

TGAGTTTTCA AATAAAGAAC TCAAAACCAA TGGCTATCCA GATTTGCAGC AAACGCTCTT 2880 

CGATATGAGG GAAAAAATGT ATGCAAAOCA GCTGCACAAT TCATGGCAAG AGCTAAGCTT 2940 

GCTGGAAAAA TCCTGTGTAA CCGTGCGATT GAAGCAATTC TCGATTTTTA CGGAAAGAAA 3000 

TTTAATCCAG CGAGCAAAAG AAGGAAAGCG CGCATCTTCG CTACAATTTG TTCACOAGTG 3060 

TTTTATCACG ACCCGAGXAC ATGCGAAGAG CATTCGCGAT GCAGGCGTGC CTAAACTAAA 3120

TGAGGCTCTC GTCGGAACTT GTAAATTCTT TTTCTCTTGT GGTTTCAAAA TTTTTGCGCG 3180

ATGCTATAGC GACATAATAT ACCTTCTGAA OGTGTGTTTG GTTTTCTCCT TGGTGCTACA 3240 

AATGTCCAAT ACTGTGCGCA GTATGATAGC AGCGACAAGG GAAGAAAAAG AGAGAGOGAT 3300 

GGGAAATAAA GCTGATGAAA ATGAAAGGAC GTTAATGCAT ATGTACCACA TTTTCAGCAA 3360 

GAAACAGGAT GATGCGCCCA TATACAATGA CTTTCTTGAA CATGTGCGTA ATGTGAGACC 3420 

AGATCTTGAG GAAACTCTCT TGTACATGGC TGGCGTAGAA GTTGTTTCAA CACAGGCTAA 3480 

GTCAGCGGTT CAGATTCAAT TCGAGAAAAT TATAGCTGTG TTGGCGCTGC TTACCATGTG 3540 

CTTTGACGCC GAAAGAAGCG ATGCCATTTT CAAGATTTTG ACAAAACTCA AAACAGTTTT 3600 

TCGTACGGTT GGAGAAACGG TCCGACTTCA AGGGCTTGAA GACATTGAAA GCTTGGAGGA 3660

CGATAAAAGA CTCACAATTC ATTTTGATAT TAACAC6AAC GAGCCTCAAT CGTCAACAAC 3720

ATTTGATGTC CATTTTGATG ACTGGTGGAA TOGGCAACTA CAGCAAAATC GCAGAGTTCC 3780

ACATTACAGG ACCACAGGCA AATTCCTTGA ATTTACGAGA AATACTGCAG CTTTTGTGGC 3840

CAATGAAATA GCATGATGAA GTGAGGGAGA GTTCTTAGTT AGAGGAGCAG TAGGTTCTOG 3900



AAAATCAACG AGCTTACCTG CACATCTTGC CAAGAAGGGT AAGGTGTTAC TACTCGAACC 3960 



TACACGCCCT TTGGCGGAGA ATGTTAGTAG ACAGTTAGCA GGTGATCCTT TCTTTCAAAA 4020

CGTTACACTC AGAATGAGAG GGTTAAGTTG TTTTGGTTCA AGGAATATTA CAGTGATGAC 4080

GAGTGGATTT CCTTTTCACT ACTATGTTAA CAATCCACAT GAATTGATGG AATTTGACTT 4140



TGTCATCATA GACGAGTGCC ATGTCACAGA CAGTGCGACC ATAGCTTTCA ATTGTGCACT 4200 


TAAAGAGTAC AACTTTGCTG GCAAATTGAT TAAAGTGTCT GCAACGCOGC CAGGGAGAGA 4260 



GTGCGATTTC GATACGCAAT TCGCGGTGAA AGTCAAAACA GAGGACCATC TTTCATTCCA 4320 



TGCATTCGTT GGCGGACAGA AGACTGGTTC AAATGCTGAC ATGGTTCAGC ATGGTAATAA 4380 


CATACTTGTG TATGTTGCAA GTTACAACGA AGTGGACATG CTCTCTAAGT TACTCACTGA 4440 



GOGCGAATTT TCAGTTACAA AGGTAGATGG GCGAACAATG CAOCTTCCAA AAACTACCAT 4500 



TGAAAOGCAT GGAACTAGCC AAAAGCCCCA TTTCATAGTA GCTACAAACA TCATCGAGAA 4560 


TGGAGTGAOG TTGGATGTTG AGTGTGTTGT TGATTTTGGA CTAAAAGTGG TOGCAOAACT 4620 



GGAGAGGGAA AATCGGTGTG TGCGCTACAA TAAGAAATCA GTTAGTTATG GAGAGAGGAT 4680 



TCAGCGACTA GGAAGAGTGG GGAGATCTAA GCCTGGAACT GCATTGCGTA TAGGGCACAC 4740 


AGAAAAAGGC ATCGAAACGA TTCCTGAATT CATTGCCACA GAAGGAGGAG CCTTATCATT 4800 

TGGATATGGG CTTCCAGTGA CGACACATGG AGTTTCCAGA AATATACTTG GAAAGTGGAC 4860 

AGTTAAACAG ATGAAATGTG CTTTGAACTT TGAGCTAACT CCTTTCTTCA CCACTCATTT 4920

AATCCGTCAT GATGGTAGTA TGCATCCACT AATACACGAA GAATTGAAGC AGTTCAAACT 4980

CAGGGATTCA GAAATGGTCC TCAACAAGGI TGCATTACCT CATCAATTTG TGAGCCAATG 5040

GATGGATCAA AGTGAGTATG AACGCATTGG AGTGCACOTT CAATGCCAIG AOAGCACACG 5100

CATACCTMT TACACAAATG GAATACCTGA TAAAGTCTAI GAOAOAATTT GGAAGTGCAT 5160

ACAAGAAAAC AAGAACGATO CGGTTTTTGG XAAGCTTTCA AGTGCTTGTT CAACTAAGGT 5220

TAGTXATACA CTTAOCACTO ATCCAGCAGC AWACCCAOA ACIAIIGCAA TCATOGATCA 5280

CCTGCTTGCC GAGGAAATGA TGAAGCGGAA ICACTTCGAC ACTATCAGCT CAGCTCTAAC 5340

GGGCIATTCA 1U1LU.I I O CTGGAATTGC TCATTCTTTC AGGAAGAGAT ACATGCGCGA 5400

TTACACAGCG CACAACATTG CAATTCTCCA ACAAGCACGT GCCCAGCTGC TTGAATITAA 5460

IAGTAAGAAT GTGAACATTA ACAAICTGTC COATTTAGAA GGAATTGGAG TCATTAAOTC 5520

GGTGGTGTTG CAAAOTAAGC AAOAGGTCAO CAGTTPCCTC GGACTTCGCG GTA*ATGGGA 5580

TGGAAAGAAA TTTGCGAATO ATCTGATATT GGCGATTATO ACACTCTTAG GAGOTGGGTO 5640

GTTCATGTCC GAATACTTCA CGAAAAAGAT CAATGAACCC GTGCGCGTTG AAAGCAACAA 5700

ACGTCGATCT CAAAAATTOA AATTCAGGGA IGCGTACOAT AGAAAAOTTG GACGTGAOAT 5760

TTCTGGTGAT GATGATACAA TTGGGCGCAC TTTCGGCGAA GCTTACACGA AGAGAGGAAA 5820

GGTCAAAGGA AACAACAACA CAAAAGCAAT GGGACGGAAA ACTCGCAATT TTOTGCATTT 5880

ATATGGTGTG CACCCTCACA ATTACACTTT TATCAGATTT GTGGACCCTC TCACIGGCCA 5940

TACATTGGAC GAAAGCACCC ATACAGACA? ATC6TTAGTG CAGOAGGAGT TTGGAAOTAT 6000

TAGAGAGAAA TTTCTGGAGA ATGATTTGAT CTCGAGGCAG TCTATTATCA ACAAACCCGG 6060

CATTCAGGCA TATTTTATGG GCAAGGGCAC TGAAGAAGCA CTCAAAGTTG ACTTGACTCC 6120

TCATGIACCA TTCCTTCIGT GCAOAAACAC CAAIGCIATT GCGGGATACC CAGAGACAGA 6180



AAATGAGTTO AGACAAACTG GCACACCAGT CAAGGTTTC? TTTAAAGACC TGCCAGAGAA 6240 

AAACGAACAT GTCGACTTGG AGAGCAAATC TATCTACAAA GGAGTGCGCG ATTACAATGG 6300 

CATCTCAACA ATCGTTTGTC AATTAACGAA CGATTCTCAT GGCCTCAAGC AGACCATGTA 6360 

TGGTATTGGC TATGGGCCAA TAATCATCAC TAATGCACAC CTCTTCAGGA AAAACAATGG 6420 

CACACTTCTA GTCAGGTCTT GGCATGCTGA ATTCATTGTT AAAAATACCA CAACGCTCAA 6480 

AGTGCATTTC ATAGAAGGGA AGGATGTCGT GTTAGTGCGC ATGCCAAAGG ACTTTCCGCC 6540 

GTTTAAAAGC AACGCTTCTT TTAGGGCACC AAAACGCGAG GAACGACGAT CCTTGOTTGG 6600 

GACAAACTTT CAAGAAAAGA GTCTTCGCTC CACTGTTTCG GAATCTTCCA TGACAATACC 6660 

TCAAGGAACT GGCTCATATT GGATACATTG GATTTCGACC AACGAAGGGG ATTGCOGATT 6720 

GCCCATGGTT TCAACAACGG ATGGCAAGAT AATTGGAGTT CATGCTTTGG CTTCCACAGT 6780 






CTCATCTAAG AATTATTTTG TCCCATTCAC TGATGATTTT ATAGCCACGC ATTTOAGCAA 6840 

ACTTGATGAC CTCACATGGA CTCAGCATTG GCTATGGCAA CCTAGCAAAA TTGCGTGGGG 6900

AACGCTCAAC TTAGTTGATG AACAACCAOO GCCCGAATTT CGTATCTCAA ATCTAGTCAA 6960 

GOATTTATTC ACTTCTGGTG TTCAAACACA GAGCAAGCGA GAAAGATGGG TCTAOGAAAG 7020 


CTGTGAAGGG AACCTTCCGG CTGTTGGAAC TGCACAATCA GCGTTAGTCA CCAAACATGT 7080 

TGTGAAAGGC AAGTGTCCTT TCTTCGAAGA ATATTTACAA ACACACOCAG AAGCGAGCGC 7140 

CTATTTCAGA CCCCTAATGG GAGAGTACCA GCCGAGCAAG TTGAACAAAG AAGCCTTTAA 7200 



AAAGGATTTC TTTAAATACA ATAAACCCGT CACTGTTAAC CAACTGGATC ATGATAAATT 7260

TTTGGGAGCA GTGGATGGGG TTATACGTAT GATGTGTOAT TTTGAGTTCA ACGAATGTCG 7320 

ATTCATTACA GATCCCGAGG AAATTTACAA CTCTTTGAAC ATGAAAGCAG CAATTGGAGC 7380 

CCAGTATAGA GGAAAGAAGA AAGAGTATTT TGAGGGGCTA GATGATTTTG ATCGAGAGCG 7440

ACTTTTATTC CAAAGTTCTG AAAGGTTGTT CAATGGCTAC AAAGCTCTGT CGAATGCATC 7500

TTTAAACCCC GAGCTCAGGC CCCTTGAGAA AGTCAGGGCT AACAAAACAC GAACCTTTAC 7560

AGCAGCGCCA ATTCATACAT TGCTTGCAGC TAAAGTTTGT GTGGATGATT TCAACAATGA 7620 

CTTCTACAGG AAAAACCTCA AGTGTCCATG GACGGTOGGC ATGACAAAAT TTTATGGTGG 7680 

TTGGGATAAA TTGATGAGAT CATTACCTGA TCGTTGCTTG TATTGTCATG CTGATQPATC 7740 

ACAGTTCGAT AGTTCGTTAA CCCCAGCCTT ACTGAACGCA GTGCTCATAA TCAGGTCATT 7800 

TTATATGGAG GATTGGTGGG TCGGCCAAGA GATGCTTGAA AATCTTTATG CCCAGATTGT 7860 

GTACACTCCA ATTCTTGCTC CTGATGGAAC AATTTTCAAG AAATTTAGAG GTAACAACAG 7920 

TGGGCAACCC TCAACAGTGG TGGATAACAC ACTAATGGTT GTGATCTCTA TTTACTATCC 7980 

GTGCATGAAA TTTGCTTGGA ACTGCGAGGA GATTGAGAAT AAACTTGTCT TCTTTGCAAA 8040 

TGGAGATGAT CTGATACTTG CAGTCAAAGA TGAGGATAGC GGCTTACTTC ATAACATGTC 8100 

ATCCTCTTTT TGCGAACTTG GACTGAATTA TGATTTTTCA GAACGTACGC ATAAAAGAGA 8160

AGATCTTTGG TTCKTGTCCC ACCXAGCAAT GCXAGTTGAT GGAATGTACA CTCCAAAACT 8220 

CGAGAAAGAG ACAATTCTTT CAATTCTAGA GTGGGATAGA AGCAAAGAAA TTATOCACCO 8280 

AACAGAGGCT ATTTGCGCTG CGATGATTGA GCCATCGCCG CACACCGAGC TCTTGCAAGA 8340 

AATCAGAAAG TTTTACCTAT GGTTCGTTGA AAAAGAAGAG GTGCGAGAAT TCCCAOCCCT 8400 

CGGAAAAGCT CCAIACATAG CTGAGACAGC ACTTCGTAAG TTATACACTG ACAAGGGACC 8460 

AGATACAAGT GAACTGGCAC GCTACCTACA AGCCCTCCAT CAAGATATCT TCTTTGAGC* 8520 

AGGAGACACT GTGATGCTCC AATCAGGCAC TCAGCCAACT GTGGCAGATG CTGGAGCTAC 8580 

AAAGAAAGAT AAAGAAGATG ACAAAGGGAA AAACAAGGAC GTTACACCCT CCGGCTCAGG 8640 

TGAGAAAACA GTAGCAGCTG TCACGAAGGA CAAGGATGTG AATGCTGGTT CTCATGGGAA 8700 






AATTOTOCCG CGTCTTTCCA ACATCACAAA GAAAATCTCA TTGCCACGCG TGAAAGGAAA 8760

TCTGATACTC CATATTCATC ATTTGCTOCA ATATAAACCG CATCAAATTC AGTTATATAA 8820 

CACACGAGCG TCTCATCAGC A6TTCGCCTC TTGGTTCAAC CAGGTTAAGA CGGAATATGA 8880 

TTTGAACGAG CAACAGATGG GAGTTGTAAT GAATGGTTTC ATGCTTTGGT GCATTGAGAA 8940 

TGGCACTTCA CCCGACATTA ATCGAGTGTG GGTTATGATG GACGGAAATG AGCAAGTTGA 9000 

GTATCCCTTG AAACCAATAG TTGAAAATGC AAAGCCAACG CTGCGGCAAA TAATGCATCA 9060 

TTTTTCAGAT GCAGCGGAGG CATATATAGA GATGAGAAAT GCACAGGCAC CATACATGCC 9120 

GAGGTATGGT TTGCTTCGAA ACCTACGGGA TAGGAGTTTA GCACGATATG CTTTTGATTT 9180

CTATGAAGTC AATTCTAAAA CTCCTGAAAG AGCCCGCGAA OCTCTTGCOC AGATGAAAGC 9240

AGCAGCTCTT AGCAATGTTT CTTCAAGGTT CTTTGOCCTT GATCGAAATG TTGCCACCAC 9300

TAGCGAAGAC ACTGAACGGC ACACTGCACG TGATGTTAAT AQAAACATGC ACACCTTACT 9360

AGGTGTGAAT ACAATGCAGT AAAGGGTAGG CCCCCTACCT AGGTTATTGT TTCGCTGCCG 9420

ACOTAATTCT AATATTTACC GCTTTATTTG ATATCTTTAG ATTTCCAGAG TGGGCCTCCC 9480 

ACCTTTAAAG CGTAAAGTTT ATGTTAGTTG TCCAGGAOTG CCGTAGTCCT TTCGGAAGCT 9540

TTAGTGTGAG CCTCTCACGA ATAAGCTCGA GATTAGACTC CGTTTGCAAG OCT 9593
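
Because each line of the listing ends with the cumulative position of its last base, the layout can be checked mechanically. The following sketch, with hypothetical function names, counts the bases on each line, compares the running total with the printed position, and flags characters outside A, C, G and T. It only reports problems; it makes no attempt to correct the sequence.

```python
def parse_listing_line(line):
    """Split one listing line into (bases, printed_position)."""
    tokens = line.split()
    if len(tokens) < 2 or not tokens[-1].isdigit():
        raise ValueError(f"no position number at end of line: {line!r}")
    return "".join(tokens[:-1]), int(tokens[-1])


def check_listing(lines, alphabet="ACGT"):
    """Return a list of human-readable problems found in the listing lines."""
    problems = []
    running = 0
    for number, line in enumerate(lines, start=1):
        bases, printed = parse_listing_line(line)
        running += len(bases)
        if running != printed:
            problems.append(f"line {number}: printed {printed}, counted {running}")
            running = printed  # resynchronize so one slip is reported once
        unexpected = sorted(set(bases.upper()) - set(alphabet))
        if unexpected:
            problems.append(
                f"line {number}: unexpected characters {''.join(unexpected)}"
            )
    return problems


# The first two lines of SEQ ID NO:1 in the layout used by the listing.
good = [
    "AAAATTGAAA CAAATCACAA ACACTACAAC AATCAACCAT CAAGCAAACG AATTTTTGAA 60",
    "CCTATTTACA AACAAGCAAT CTAAAACTCT TACAGTATTA AGAAATTCTC CAATCACTTC 120",
]
print(check_listing(good))
# A made-up third line containing a non-nucleotide character.
print(check_listing(good + ["ACGTACGTAO 130"]))
```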
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1. A method for detecting a protease 
inhibitor, said method comprising:

(a) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(b) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(c) transforming said host cells of (a) with
said nucleic acid constructs of (b); and

(d) growing said transformed host cells of (c)
under conditions that distinguish cells with said
selectable phenotype, thereby detecting the presence of
said protease inhibitor.

2. The method of claim 1 wherein said host 
cells are bacterial cells. 

3. The method of claim 2 wherein said
selectable phenotype is the ability of said bacterial
cells to grow in the presence of a given antibiotic.

4. The method of claim 3 wherein said second
nucleic acid sequence comprises a nucleic acid sequence
encoding E. coli ribosomal protein S12 and said
antibiotic is streptomycin.

5. The method of claim 2 wherein said first
nucleic acid sequence comprises a nucleic acid sequence
encoding ZYMV 49 kDa protease.

6. The method of claim 2 wherein said 
selectable phenotype is the ability of said transformed 
host cells to grow in the presence of a given carbon 



7. The method of claim 1 wherein said second
nucleic acid sequence comprises a nucleic acid sequence
encoding a protein which is inactivated by said protease.

8. The method of claim 1 wherein said second
nucleic acid sequence comprises a nucleic acid sequence
encoding a protein which is activated by said protease.

9. A DNA construct comprising: 

(a) a first DNA coding sequence for a protein
capable of conferring a selectable phenotype on a host
cell transformed therewith, said selectable phenotype
dependent on the activity of a protease; and

(b) control sequences that are operably linked
to said first and second coding sequences whereby said
coding sequences can be transcribed and translated in a
host cell, and at least one of said control sequences is
heterologous to at least one of said coding sequences.

10. The DNA construct of claim 9 further
comprising a second DNA coding sequence for said

11. The DNA construct of claim 10 wherein said
protease is ZYMV 49 kDa protease.






12. The DNA construct of claim 9 wherein said
first DNA coding sequence codes for E. coli ribosomal
protein S12 and said selectable phenotype is streptomycin

13. The DNA construct of claim 10 wherein said
first DNA coding sequence codes for E. coli ribosomal
protein S12 and said selectable phenotype is streptomycin

14. The DNA construct of claim 11 wherein said
first DNA coding sequence codes for E. coli ribosomal
protein S12 and said selectable phenotype is streptomycin

15. A DNA construct comprising: 

(a) a DNA coding sequence for a protein 
capable of inhibiting the action of a given protease, 
said protein identified by the method of claim 1; and 

(b) control sequences that are operably linked 
to said coding sequence whereby said coding sequence can 
be transcribed and translated in a host cell, and at 
least one of said control sequences is heterologous to at 
least said coding sequence. 

16. The DNA construct of claim 15 wherein said
protease is ZYMV 49 kDa protease.



17. A host cell stably transformed with a DNA
construct according to claim 9.

18. A host cell stably transformed with a DNA
construct according to claim 10.

19. The host cell of claim 18 further
transformed with a DNA construct comprising:

(a) a DNA coding sequence for a protein
capable of inhibiting the action of a given protease,
said protein identified by a method comprising

(i) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(ii) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(iii) transforming said host cells of (i)
with said nucleic acid constructs of (ii); and

(iv) growing said transformed host cells
of (iii) under conditions that distinguish cells with
said selectable phenotype, thereby detecting the presence
of said protease inhibitor; and

(b) control sequences that are operably linked
to said coding sequence whereby said coding sequence can
be transcribed and translated in a host cell, and at
least one of said control sequences is heterologous to at
least said coding sequence.

20. A host cell stably transformed with a DNA 
construct according to claim 11. 

21. The host cell of claim 20 further
transformed with a DNA construct comprising:

(a) a DNA coding sequence for a protein
capable of inhibiting the action of a given protease,
said protein identified by a method comprising

(i) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(ii) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(iii) transforming said host cells of (i)
with said nucleic acid constructs of (ii); and

(iv) growing said transformed host cells
of (iii) under conditions that distinguish cells with
said selectable phenotype, thereby detecting the presence
of said protease inhibitor; and

(b) control sequences that are operably linked
to said coding sequence whereby said coding sequence can
be transcribed and translated in a host cell, and at
least one of said control sequences is heterologous to at
least said coding sequence.

22. A host cell stably transformed with a DNA 
construct according to claim 12. 

23. A host cell stably transformed with a DNA
construct according to claim 13.

24. The host cell of claim 23 further 
transformed with a DNA construct comprising: 

(a) a DNA coding sequence for a protein
capable of inhibiting the action of a given protease,
said protein identified by a method comprising

(i) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(ii) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(iii) transforming said host cells of (i)
with said nucleic acid constructs of (ii); and

(iv) growing said transformed host cells
of (iii) under conditions that distinguish cells with
said selectable phenotype, thereby detecting the presence
of said protease inhibitor; and

(b) control sequences that are operably linked
to said coding sequence whereby said coding sequence can
be transcribed and translated in a host cell, and at
least one of said control sequences is heterologous to at
least said coding sequence.

25. A host cell stably transformed with a DNA 
construct according to claim 14. 

26. The host cell of claim 25 further
transformed with a DNA construct comprising:

(a) a DNA coding sequence for a protein
capable of inhibiting the action of ZYMV 49 kDa protease,
said protein identified by a method comprising

(i) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(ii) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(iii) transforming said host cells of (i)
with said nucleic acid constructs of (ii); and

(iv) growing said transformed host cells
of (iii) under conditions that distinguish cells with
said selectable phenotype, thereby detecting the presence
of said protease inhibitor; and

(b) control sequences that are operably linked
to said coding sequence whereby said coding sequence can
be transcribed and translated in a host cell, and at
least one of said control sequences is heterologous to at
least said coding sequence.